Adaptive Multimodal Fusion by Uncertainty Compensation With Application to Audiovisual Speech Recognition



Similar resources

Adaptive multimodal fusion by uncertainty compensation

While the accuracy of feature measurements heavily depends on changing environmental conditions, studying the consequences of this fact in pattern recognition tasks has received relatively little attention to date. In this work we explicitly take into account feature measurement uncertainty and we show how classification rules should be adjusted to compensate for its effects. Our approach is pa...


A Multimodal Approach to Audiovisual Text-to-Speech Synthesis

Oral speech has always been the most important means of communication between humans. When a message is conveyed using oral speech, it is encoded in two separate signals: an auditory speech signal and a visual speech signal. The auditory speech signal consists of a series of speech sounds that are produced by the human speech production system. In order to generate different sounds, the paramet...


End-to-end Audiovisual Speech Recognition

Several end-to-end deep learning approaches have been recently presented which extract either audio or visual features from the input images or audio signals and perform speech recognition. However, research on end-to-end audiovisual models is very limited. In this work, we present an end-to-end audiovisual model based on residual networks and Bidirectional Gated Recurrent Units (BGRUs). To the ...


Audiovisual Speech Synchrony Measure: Application to Biometrics

Speech is a means of communication which is intrinsically bimodal: the audio signal originates from the dynamics of the articulators. This paper reviews recent works in the field of audiovisual speech, and more specifically techniques developed to measure the level of correspondence between audio and visual speech. It overviews the most common audio and visual speech front-end processing, trans...



Journal

Journal title: IEEE Transactions on Audio, Speech, and Language Processing

Year: 2009

ISSN: 1558-7916

DOI: 10.1109/tasl.2008.2011515